ABSTRACT

Anisotropic measurements of the Baryon Acoustic Oscillation (BAO) feature within a galaxy survey enable joint inference about the Hubble parameter $H(z)$ and angular diameter distance $D_A(z)$. These measurements are typically obtained from moments of the measured 2-point clustering statistics, with respect to the cosine of the angle to the line of sight $\mu$. The position of the BAO features in each moment depends on a combination of $D_A(z)$ and $H(z)$, and measuring the positions in two or more moments breaks this parameter degeneracy. We derive analytic formulae for the parameter combinations measured from moments given by Legendre polynomials, power laws and top-hat Wedges in $\mu$, showing explicitly what is being measured by each in real-space for both the correlation function and power spectrum, and in redshift-space for the power spectrum. The large volume covered by modern galaxy samples means that the correlation function can be well approximated as having no correlations at different $\mu$ on the BAO scale, and that the errors on this scale are approximately independent of $\mu$. Using these approximations, we derive the information content of various moments. We show that measurements made using either the monopole and quadrupole, or the monopole and $\mu^2$ power-law moment, are optimal for anisotropic BAO measurements, in that they contain all of the available information using two moments, the minimal number required to measure both $H(z)$ and $D_A(z)$. We test our predictions using 600 mock galaxy samples, matched to the SDSS-III Baryon Oscillation Spectroscopic Survey CMASS sample, finding a good match to our analytic predictions. Our results should enable the optimal extraction of information from future galaxy surveys such as eBOSS, DESI and Euclid.

1 INTRODUCTION


The clustering of galaxies contains the imprint of the BAO scale, at a fixed co-moving distance $\sim 150\,$Mpc (see, e.g., Eisenstein 2005 for a review). The apparent location of the feature along the line of sight (los) depends on the value of the Hubble parameter, $H(z)$, and its apparent location transverse to the los depends on the angular diameter distance, $D_A(z)$. Thus, measurements of the clustering of galaxies along and transverse to the los allow simultaneous measurement of $D_A(z)$ and $H(z)$ (see, e.g., Hu & Haiman 2003 and Padmanabhan & White 2008).

The Sloan Digital Sky Survey (SDSS; York et al. 2000) III (Eisenstein et al. 2011) Baryon Oscillation Spectroscopic Survey (BOSS; Dawson et al. 2013) has provided galaxy samples large enough to robustly measure BAO scale information along and transverse to the los and thus independently measure $D_A(z)$ and $H(z)$. Two methods have been applied to BOSS data that isolate the BAO information: "Wedges" (Kazin et al. 2012, 2013) and "Multipoles" (Xu et al. 2013), and results using both methodologies are presented in Anderson et al. (2014).


As measurements become statistically more precise, there is an increased pressure on the analysis pipeline to ensure the extraction of information is robust. The elements of the pipeline requiring careful consideration include the models to be fitted to the data, the statistical procedure to be applied, accurate estimation of systematic errors, and a precise knowledge of what is actually being measured. In this paper, we focus on the last of these issues for anisotropic BAO measurements, considering the information content of moments of 2-point statistics. Recently, studies such as Taruya et al. (2011), Font-Ribera et al. (2014) and Blazek et al. (2014) have also studied the information content of anisotropic clustering measurements. In our study, we build on these results by focusing purely on the $D_A(z)$ and $H(z)$ information that can be measured via the BAO position, thereby enabling an alternative and simplified analytic treatment. Further, we focus primarily on post-'reconstruction' clustering measurements (Eisenstein et al. 2007), where the large-scale clustering amplitudes are expected to be isotropic. In this case, we show that moments based on polynomials of the cosine of the angle to the los ($\mu$) are complete for any non-degenerate set of two moments that includes zero and second order terms. We then compare the precision of $D_A(z)$ and $H(z)$ measurements one obtains using the Wedges and Multipoles methodology, both based on analytical predictions and empirical measurements.

Our paper is structured as follows: After developing general formulae in Section 2, we assume that information is equally distributed with respect to $\mu = \cos(\theta_{\rm los})$, equivalent to a spherically symmetric distribution and matching empirical results, and in Section 3 we predict the variance and covariance expected on $D_A(z)$ and $H(z)$ measurements for two simple combinations of measurements: one in which a combination of the spherically averaged clustering and clustering averaged over a $\mu^p$ window are used, and another using Wedges split at an arbitrary $\mu_d$. In Section 4, we describe how the BAO scale can be fitted using the different methodologies. In Section 5, we measure the BAO scale using 600 mock BOSS samples and compare the results obtained using each methodology, and to the results of Anderson et al. (2014). Where applicable, we assume the same fiducial cosmology as in Anderson et al. (2014): $\Omega_m = 0.274$, $h = 0.7$, $\Omega_b h^2 = 0.0224$.


2 THE ANISOTROPIC BAO SIGNAL 


In this section we describe our formalism for considering measurements of the projected BAO scale including an isotropic dilation, and the anisotropic Alcock-Paczynski effect (Alcock & Paczynski 1979). We present our formalism in configuration space, but our derivations are equally valid in Fourier space and therefore applicable to $P(k,\mu)$ measurements.

The observed distance between two galaxies $r$, defined assuming a fiducial or reference cosmological model, and the observed cosine of the angle the pair makes with respect to the line-of-sight (los) $\mu$ are given by

$$r^2 = r_\parallel^2 + r_\perp^2, \qquad \mu = \frac{r_\parallel}{r}, \tag{1}$$

where $r_\parallel$ is the los separation and $r_\perp$ is the transverse separation. The estimate of these separations is dependent on the assumed cosmology. Defining

$$\alpha_\parallel = \frac{H_{\rm fid}(z)}{H_{\rm true}(z)}, \qquad \alpha_\perp = \frac{D_{A,{\rm true}}(z)}{D_{A,{\rm fid}}(z)}, \tag{2}$$

the true separation, $r'$, is given by

$$r' = \alpha r = \sqrt{\alpha_\parallel^2 r_\parallel^2 + \alpha_\perp^2 r_\perp^2}. \tag{3}$$

We can re-arrange the above equations to express the stretch as a function of the angle to the line of sight:

$$\alpha(\mu) = \sqrt{\mu^2\alpha_\parallel^2 + (1-\mu^2)\alpha_\perp^2}. \tag{4}$$

Assuming symmetry around $\mu = 0$, we can consider any moment of the 2-point clustering signal as an integral over measurements made along different directions with given weighting. For the correlation function we can write

$$\xi_F(r) = \int_0^1 d\mu\, F(\mu)\, \xi(r,\mu), \tag{5}$$

where $F(\mu)$ gives the relative weight of each direction to the moment. For the monopole of the correlation function, for example, $F(\mu) = 1$. In this paper we only consider functions $F(\mu)$ that are normalised, that is for which $\int_0^1 F(\mu)\,d\mu = 1$.

In real-space, the correlation function for galaxies in a thin slice in $\mu$ can be written $A(\mu)\,\xi(\alpha(\mu) r)$, where $A(\mu)$ alters the amplitude, but not the shape or BAO position. If RSD have been removed during a "reconstruction" (Eisenstein et al. 2007) step, this also holds. Pre-reconstruction in redshift space, we need to adjust the template to be fitted to allow for correlation function shape changes (Jeong et al. 2014). If $\alpha \neq 1$, Eq. 5 describes a shift in the mean position of the BAO in the moment, which we denote $\alpha_F$, together with a "broadening" of the BAO bump, which is now the superposition of features at different $\alpha(\mu)$, which varies as given in Eq. 4. For cosmological models close to the fiducial cosmology used to calculate the correlation function, the broadening is small and is degenerate with the non-linear BAO damping. Consequently information from the BAO feature width is commonly neglected, with the primary measurement being the BAO position $\alpha_F$. Information from the broadening was included in the anisotropic BAO measurements of Anderson et al. (2014), where models of the moments were calculated by integrating directly over $\xi(r,\mu)$. The additional constraints available from the observed shape of the BAO feature mean that the contours from any single moment in $\alpha_\parallel$ and $\alpha_\perp$ are closed, but this closure of the contours is not important when fitting to multiple moments, which generally break this degeneracy much more strongly.

We seek to express the expectation for the measured stretch, $\alpha_F$, determined from a moment of the 2-point clustering signal ($\xi_F$; Eq. 5), in terms of the radial and transverse stretch through the expression for $\alpha(\mu)$ given by Eq. 4. Following the arguments above, we assume the information on $\alpha(\mu)$ is separable from the overall shape of the clustering signal. This is equivalent to the modeling used in, e.g., the Anderson et al. (2014) BAO fits to the measured $P(k)$, where the model consists of a BAO feature and nuisance parameters describing the overall shape of $P(k)$, and similar to the modeling used to fit $\xi(s)$ in the same study. When this is the case, the maximum likelihood $\alpha(\mu)$ determined from any measured $\xi_{\rm meas}(\mu)$ must be independent of any other parameters. We further assume that information in different $\mu$ bins is independent and distributed equally (which we justify empirically in Section 3), and thus the maximum likelihood stretches $\alpha_i$ in any $\mu$ bins $i$ are independent. This combination of assumptions implies that, for positive-definite windows $F(\mu)$, the maximum likelihood $\alpha_F$ obtained from $\sum_i F(\mu_i)\xi(\mu_i)\Delta\mu_i$ is the same as the weighted sum of individual maximum likelihood $\alpha_i$, $\sum_i F(\mu_i)\alpha_i\Delta\mu_i$, which is

$$\alpha_F = \int_0^1 d\mu\, F(\mu)\sqrt{\mu^2\alpha_\parallel^2 + (1-\mu^2)\alpha_\perp^2} \tag{6}$$

for infinitesimal bins in $\mu$ and the corresponding $\xi_F$ clustering measurements defined by Eq. 5.

One can fit to $\alpha^2(\mu)$ rather than $\alpha(\mu)$. For moments $\xi_F$, this is equivalent to measuring the weighted average of $\alpha^2(\mu)$ over the window $F(\mu)$, whose expected maximum likelihood value we express as $\langle\alpha^2_F\rangle$, and is somewhat simpler to interpret for some functions $F(\mu)$. In this case, we have that

$$\langle\alpha^2_F\rangle = \int_0^1 d\mu\, F(\mu)\left[\mu^2\alpha_\parallel^2 + (1-\mu^2)\alpha_\perp^2\right]. \tag{7}$$

In the following we consider both approaches, fitting for either $\alpha_F$ or $\langle\alpha^2_F\rangle$. Note that using a positive-definite function has the added advantage that, in real-space or post-reconstruction, the moments have the same shape as the linear 2-point clustering measurement to first-order when $\alpha_\parallel = \alpha_\perp = 1$. Thus they will all display a clear BAO feature that can be easily fitted.

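Any single measurement of $\alpha_F$ or $\langle\alpha^2_F\rangle$ from a moment of the correlation function or power spectrum will result in a degenerate measurement of $\alpha_\parallel$ and $\alpha_\perp$. Expanding around the best-fit solution to first order, we can fit the degeneracy direction, showing that the primary measurement of $\alpha_F$ or $\langle\alpha^2_F\rangle$ results in the same degeneracy, with a form

$$\alpha_F^{m+n} = \alpha_\parallel^m\,\alpha_\perp^n, \tag{8}$$

where

$$m \equiv \left.\frac{\partial\alpha_F}{\partial\alpha_\parallel}\right|_{\alpha_\parallel,\alpha_\perp=1} = \int_0^1 d\mu\, F(\mu)\,\mu^2, \tag{9}$$

$$n \equiv \left.\frac{\partial\alpha_F}{\partial\alpha_\perp}\right|_{\alpha_\parallel,\alpha_\perp=1} = \int_0^1 d\mu\, F(\mu)\,(1-\mu^2). \tag{10}$$

The factor $m+n$ on the left-hand side of Eq. 8 renormalises $\alpha_F$ to the correct units.

In the following we consider particular forms for the function $F(\mu)$. The analysis should be valid for both power spectrum and correlation function analyses.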
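As an illustrative numerical check (not part of the original analysis), the sketch below evaluates Eqs. 6, 9 and 10 for a normalised window with a simple midpoint rule and compares the exact window-averaged stretch with the first-order form of Eq. 8. The function names, the midpoint integrator and the test values of the stretches are assumptions made only for this example.

```python
import math

def integrate(f, n_steps=20000):
    """Midpoint-rule integral of f(mu) over 0 <= mu <= 1."""
    h = 1.0 / n_steps
    return h * sum(f((i + 0.5) * h) for i in range(n_steps))

def alpha_mu(mu, a_par, a_perp):
    """Direction-dependent stretch of Eq. 4."""
    return math.sqrt(mu**2 * a_par**2 + (1.0 - mu**2) * a_perp**2)

def alpha_F(F, a_par, a_perp):
    """Window-averaged stretch of Eq. 6 (F is assumed to integrate to one)."""
    return integrate(lambda mu: F(mu) * alpha_mu(mu, a_par, a_perp))

def exponents(F):
    """First-order exponents m and n of Eqs. 9 and 10."""
    m = integrate(lambda mu: F(mu) * mu**2)
    n = integrate(lambda mu: F(mu) * (1.0 - mu**2))
    return m, n

monopole = lambda mu: 1.0            # real-space monopole window
m, n = exponents(monopole)
exact = alpha_F(monopole, 1.02, 0.99)
first_order = (1.02**m * 0.99**n) ** (1.0 / (m + n))
print(m, n, exact, first_order)
```

For stretches within a few per cent of unity the two estimates should agree closely, which is the regime in which the first-order degeneracy direction is used.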


2.1 Fitting the monopole 


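For the monopole in real-space, $F(\mu) = 1$, and Eqns. 9 & 10 give that $m = \frac{1}{3}$ and $n = \frac{2}{3}$, and one recovers the well-known result that BAO fits to the monopole constrain $\alpha_F = \alpha_\parallel^{1/3}\alpha_\perp^{2/3}$, whose corresponding distance is commonly called $D_V$. Note that, for measurements of the dilation scale parameterised by $\langle\alpha^2_F\rangle$, the fit constrains a linear combination of $\alpha_\parallel^2$ and $\alpha_\perp^2$:

$$\langle\alpha^2_F\rangle = \frac{1}{3}\alpha_\parallel^2 + \frac{2}{3}\alpha_\perp^2. \tag{11}$$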

For the monopole of the power spectrum in redshift-space,

$$F(\mu) = \frac{(1+\beta\mu^2)^2}{1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2}, \tag{12}$$

including the increase in clustering amplitude driven by the Redshift-Space Distortions (Kaiser 1987). Here $\beta = f/b$, where $f$ is the logarithmic derivative of the linear growth factor with respect to the scale factor, and $b$ is a linear deterministic bias. Substituting this into Eqns. 9 & 10 and defining $A = 1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2$, gives that

$$m = \frac{1}{A}\left(\frac{1}{3} + \frac{2\beta}{5} + \frac{\beta^2}{7}\right), \qquad n = \frac{1}{A}\left(\frac{2}{3} + \frac{4\beta}{15} + \frac{2\beta^2}{35}\right). \tag{13}$$

For the SDSS-III BOSS (Dawson et al. 2013) CMASS galaxies, Samushia et al. (2014) measured $\beta = 0.34$, which translates to $m = 0.49$ and $n = 0.76$ suggesting that, to first order, the BAO-scale constraints from the monopole power spectrum measurement depend on $\alpha_F = \alpha_\parallel^{0.39}\alpha_\perp^{0.61}$. As expected, an increase in the clustering strength along the los leads to an increased dependence on $\alpha_\parallel$ in the resulting measurement.
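The numbers quoted for the redshift-space monopole can be reproduced with a few lines of arithmetic. The sketch below is only an illustration; the function name is hypothetical, and the reading that the quoted $m = 0.49$ and $n = 0.76$ are the integrals before division by $A$ is an inference from the arithmetic rather than a statement made in the text.

```python
def monopole_exponents(beta):
    """Return (A, m, n) for the redshift-space monopole window."""
    A = 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0
    m_unnorm = 1.0 / 3.0 + 2.0 * beta / 5.0 + beta**2 / 7.0
    n_unnorm = 2.0 / 3.0 + 4.0 * beta / 15.0 + 2.0 * beta**2 / 35.0
    return A, m_unnorm / A, n_unnorm / A

A, m, n = monopole_exponents(0.34)
# Integrals before dividing by A (these round to the quoted 0.49 and 0.76)
print(round(A * m, 2), round(A * n, 2))
# Normalised exponents, which sum to one
print(round(m, 2), round(n, 2))
```

After normalisation the exponents sum to one, so the measured combination is approximately $\alpha_\parallel^{0.39}\alpha_\perp^{0.61}$, slightly more sensitive to $\alpha_\parallel$ than the real-space value of $1/3$.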

Post-reconstruction, it is standard to "approximately remove" the RSD based on the estimate of the potential obtained, leaving a clustering signal whose amplitude is approximately independent of $\mu$ (e.g., Padmanabhan et al. 2012; Burden et al. 2014). Spherical averaging to give the monopole means that there is no $\beta$-dependent term, and the dependence of the monopole will revert to the real-space value. Note that in this case, or in real-space, all equations are valid for both the correlation function and the power spectrum.


2.2 Fitting power-law moments


The Legendre polynomials form an orthogonal basis and are the standard approach to measuring anisotropic clustering. However, using such bases, we can have $F(\mu) < 0$ for some $\mu$, and consequently, the recovered clustering signal cannot be considered as a sum of the clustering signals in different directions (Eq. 6 no longer holds). The interpretation of these moments is therefore complicated as the BAO information is not compressible into a single stretch value.

This is not true if we instead consider the power law moments from which the multipoles are composed. For a power law moment of the power spectrum in redshift-space,

$$F(\mu) = \frac{\mu^p(1+\beta\mu^2)^2}{(p+1)^{-1} + 2\beta(p+3)^{-1} + \beta^2(p+5)^{-1}}, \tag{14}$$

$$m = \frac{\dfrac{1}{p+3} + \dfrac{2\beta}{p+5} + \dfrac{\beta^2}{p+7}}{\dfrac{1}{p+1} + \dfrac{2\beta}{p+3} + \dfrac{\beta^2}{p+5}}, \tag{15}$$

and

$$n = 1 - m = \frac{\dfrac{2}{(p+1)(p+3)} + \dfrac{4\beta}{(p+3)(p+5)} + \dfrac{2\beta^2}{(p+5)(p+7)}}{\dfrac{1}{p+1} + \dfrac{2\beta}{p+3} + \dfrac{\beta^2}{p+5}}. \tag{16}$$

When $\beta = 0$, or post-reconstruction with RSD removal, this reduces to constraining $\alpha_F = \alpha_\parallel^{\frac{p+1}{p+3}}\alpha_\perp^{\frac{2}{p+3}}$, which is valid for both the correlation function and power spectrum.
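The dependence of the exponents on the power $p$ can be tabulated directly. The following sketch uses exact rational arithmetic and assumes the normalised window $F(\mu) \propto \mu^p(1+\beta\mu^2)^2$ together with the definitions of $m$ and $n$ as the window-weighted integrals of $\mu^2$ and $1-\mu^2$; the function name is hypothetical.

```python
from fractions import Fraction

def power_law_exponents(p, beta=Fraction(0)):
    """Exponents (m, n) for F(mu) proportional to mu^p (1 + beta mu^2)^2."""
    norm = (Fraction(1, p + 1) + 2 * beta * Fraction(1, p + 3)
            + beta**2 * Fraction(1, p + 5))
    m = (Fraction(1, p + 3) + 2 * beta * Fraction(1, p + 5)
         + beta**2 * Fraction(1, p + 7)) / norm
    return m, 1 - m   # n follows from the normalisation of F

# Real-space (beta = 0) exponents for the first few even powers
for p in (0, 2, 4):
    m, n = power_law_exponents(p)
    print(p, m, n)
```

With $\beta = 0$ the radial exponent is $(p+1)/(p+3)$, so it grows from $1/3$ at $p = 0$ towards unity as $p$ increases, in line with the increasing weight given to modes along the line of sight.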

Note that fits to $\langle\alpha^2_F\rangle$ constrain a linear combination of $\alpha_\parallel^2$ and $\alpha_\perp^2$:

$$\langle\alpha^2_F\rangle = m\,\alpha_\parallel^2 + n\,\alpha_\perp^2, \tag{17}$$

and a first order expansion as described above does not simplify the analysis. The degeneracy directions for $F(\mu) = 1$, $3\mu^2$, and $5\mu^4$ are displayed with black dashed curves in Fig. 1. As $p$ increases, the moments depend increasingly strongly on $\alpha_\parallel$ compared with $\alpha_\perp$.

As the Legendre multipoles are simply linear combinations of power-law moments, the combination of the monopole and quadrupole will contain the same information as the combination of the monopole and the $p = 2$ power-law moment. Consequently, BAO fits to either the monopole and quadrupole or to the $\mu^0$ and $\mu^2$ moments will provide the same information and, similarly, including either the hexadecapole or the $\mu^4$ moment will add the same information.


2.3 Fitting Wedges 


One could also consider setting $F(\mu)$ to be a top-hat function in $\mu$, for example splitting the monopole into two components separated at $\mu_d$. Such moments have been termed 'Wedges' (Kazin et al. 2012, 2013). Using a subscript '1' for $F(\mu) = 1/\mu_d$ for $0 \le \mu \le \mu_d$, and a subscript '2' for $F(\mu) = 1/(1-\mu_d)$ for $\mu_d \le \mu \le 1$, one finds in real-space that

$$m_1 = \frac{\mu_d^2}{3}, \qquad n_1 = 1 - \frac{\mu_d^2}{3}, \tag{18}$$

$$m_2 = \frac{1-\mu_d^3}{3(1-\mu_d)}, \qquad n_2 = \frac{2 - 3\mu_d + \mu_d^3}{3(1-\mu_d)}, \tag{19}$$

which give the coefficients for both the approximation for $\alpha_F$ of Eq. 8 and the exact solution for $\langle\alpha^2_F\rangle$ given by Eq. 17.

Fig. 1 displays the degeneracy directions given by Eq. 17 for the two wedges split at $\mu_d = 0.5$ using red dotted curves. The wedge with $\mu < 0.5$ constrains almost exclusively $\alpha_\perp$ and the $\mu > 0.5$ moment has a similar degeneracy as the $\mu^2$ power-law moment.

Figure 1. Straight-line curves denote the degeneracy direction between $\alpha_\parallel$ and $\alpha_\perp$ for various (single) clustering moments. Black dashed curves denote power-law moments $\int d\mu\,\mu^p\xi(\mu)/\int d\mu\,\mu^p$, which we denote $\xi_p$, and red dotted curves denote Wedges. The solid ellipses denote the 1 and 2$\sigma$ contours for the optimal combination of two moments, as derived in the following section.
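A quick evaluation of the Wedge exponents makes the comparison with the power-law moments concrete. This is a minimal sketch under the real-space assumption, with top-hat windows normalised to unit integral; the function name is hypothetical.

```python
def wedge_exponents(mu_d):
    """Real-space (m, n) pairs for the two top-hat Wedges split at mu_d."""
    m1 = mu_d**2 / 3.0                               # window 1/mu_d on [0, mu_d]
    m2 = (1.0 - mu_d**3) / (3.0 * (1.0 - mu_d))      # window 1/(1-mu_d) on [mu_d, 1]
    return (m1, 1.0 - m1), (m2, 1.0 - m2)

(m1, n1), (m2, n2) = wedge_exponents(0.5)
print(round(m1, 3), round(n1, 3))  # transverse wedge, mu < 0.5
print(round(m2, 3), round(n2, 3))  # radial wedge, mu > 0.5
```

For a split at 0.5 the transverse wedge has a radial exponent below 0.1, while the radial wedge has a radial exponent close to the value of $3/5$ that applies to the real-space $\mu^2$ moment.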


2.4 Fitting the Quadrupole 

While the idea of measuring an average BAO position does not work with more general $F(\mu)$ models, the primary source of signal from the quadrupole is the strength of a feature proportional to the derivative of $\xi$ (see, e.g., Padmanabhan & White 2008; Xu et al. 2013). Therefore, in real-space, where there is no RSD, the amplitude of the BAO feature observed in the quadrupole carries the majority of the information on $\alpha_\parallel$ and $\alpha_\perp$ (as opposed to any other characteristic of the quadrupole). The amplitude of the quadrupole, relative to the underlying correlation function, depends on $\alpha_\parallel$ and $\alpha_\perp$ through

$$\frac{1}{\xi(r)}\frac{\partial\xi_2(\alpha r)}{\partial\alpha_\parallel} \propto \frac{d\log\xi(r)}{d\log r}\int_0^1 \mu^2(3\mu^2-1)\,d\mu, \tag{20}$$

$$\frac{1}{\xi(r)}\frac{\partial\xi_2(\alpha r)}{\partial\alpha_\perp} \propto \frac{d\log\xi(r)}{d\log r}\int_0^1 (1-\mu^2)(3\mu^2-1)\,d\mu. \tag{21}$$

The integrals in Eqns. 20 & 21 reduce to $\frac{4}{15}$ and $-\frac{4}{15}$ respectively, showing that the dependence on $\alpha_\parallel$ and $\alpha_\perp$ is equal and opposite, suggesting that the measurement will constrain

$$\frac{\alpha_\parallel}{\alpha_\perp}\frac{d\log\xi(r)}{d\log r} \tag{22}$$

to first order, matching the dominant term in the expansion of Xu et al. (2013).
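The equal and opposite dependence follows from the fact that the quadrupole weight integrates to zero over $0 \le \mu \le 1$. The sketch below checks this with exact rational arithmetic; it assumes the unnormalised weight $3\mu^2 - 1$ (any overall normalisation of the Legendre polynomial rescales both integrals equally), and the helper name is hypothetical.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Exact integral over 0 <= mu <= 1 of sum_k coeffs[k] * mu**k."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))

# mu^2 (3 mu^2 - 1) = -mu^2 + 3 mu^4
radial = integrate_poly([0, 0, -1, 0, 3])
# (1 - mu^2)(3 mu^2 - 1) = -1 + 4 mu^2 - 3 mu^4
transverse = integrate_poly([-1, 0, 4, 0, -3])
print(radial, transverse)
```

Because the two integrals sum to the integral of the weight itself, which vanishes, the quadrupole responds only to the ratio of the radial and transverse stretches at first order.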


Table 1. BAO measurements on mocks as a function of $\mu$. $\langle\alpha\rangle$ is the mean recovered stretch parameter (the relative BAO scale in that $\mu$ window), $\langle\sigma\rangle$ is the mean recovered uncertainty on $\alpha$, and $S$ is the standard deviation of the recovered $\alpha$.

$\mu$ range            $\langle\alpha\rangle$   $\langle\sigma\rangle$   $S$      #
$0 < \mu < 0.2$        0.998     0.022     0.021    0
$0.2 < \mu < 0.4$      1.000     0.021     0.021    1
$0.4 < \mu < 0.6$      0.999     0.019     0.019    2
$0.6 < \mu < 0.8$      1.001     0.021     0.020    3
$0.8 < \mu < 1.0$      1.003     0.022     0.021    4


3 ERRORS ON MEASURED MOMENTS

If we can model the distribution of signal-to-noise of modes as a function of $\mu$, we can predict the possible constraints one may obtain on $\alpha_\parallel$ and $\alpha_\perp$. In redshift-space, on large-scales the modes have signal-to-noise that varies with $\mu$, with the linear $(1+\beta\mu^2)^2$ term increasing the amplitude of the power spectrum, which reduces the impact of the shot noise along the los. Although the amplitude of the modes is usually renormalised with the removal of the RSD during the reconstruction process, the signal-to-noise remains $\mu$-dependent, as the "RSD removal" is effectively a renormalisation of the redshift-space modes, rather than a removal of signal (Burden et al. 2014).

The window function will also affect the signal-to-noise as a function of $\mu$ in the correlation function by varying the pair numbers, and in the power spectrum by reducing the number of independent modes. However, for samples such as BOSS CMASS, the window has a negligible effect, and the statistical distribution of pairs is close to being isotropic except on very large scales. On small scales, the BAO damping is asymmetric, and radial effects such as the Fingers-of-God (FoG) become important. Thus, we might expect the distribution of signal-to-noise to be a complicated function of $\mu$.

We investigate the amount of BAO information as a function of $\mu$ empirically, using the methods described in Section 4 and the post-reconstruction mock catalogues for the BOSS CMASS sample, described in Section 5. We split the data into broad bins in $\mu$ and find the mean uncertainty and variance for BAO measurements in these bins. We present this information in Table 1, which shows that the BAO information is close to having an even distribution in $\mu$ for the correlation function. Minima for the recovered uncertainty and standard deviation on the measured BAO scale are found in the $0.4 < \mu < 0.6$ bin. A potential explanation is that between $0 < \mu < 0.5$ the effects of linear RSD boost the BAO signal, but at larger $\mu$ effects such as FoG remove information and reduce the signal-to-noise. Regardless, this minimum is shallow: the difference in recovered uncertainty is at most 15 per cent and the results therefore justify our choice to treat the information as constant in $\mu$.

One may also worry about correlation between the clustering at different $\mu$. For the power spectrum and an infinite volume, one expects no correlation between the clustering measured at different $\mu$. Once a survey window is applied, correlations will be induced, but for a survey the size of BOSS we expect these correlations to be small at the BAO scale. We measure the correlation between the BAO measurements in the five $\mu$ bins described in Table 1 and we display the correlation matrix in Fig. 2. We find the magnitude of correlations is at most 0.15, and we expect the power spectrum to be significantly less correlated than the correlation function. We therefore ignore any correlations between the BAO information at different $\mu$ in our analytical derivations that follow.

Figure 2. The correlation matrix of the five BAO measurements made in 0.2 thick $\mu$ bins obtained from 600 BOSS CMASS mocks, as described in Table 1 (and numbered in the same manner).

The results we presented in this section suggest that, to a good approximation, one can treat the distribution in $\mu$ of BAO information in the post-reconstruction DR11 BOSS CMASS sample as the same as that of an infinite real-space volume. However, the distribution for any given survey may vary based on the particular survey geometry, satellite velocities of the galaxy population (which smear the BAO feature at high $\mu$), and the magnitude of the boost in clustering amplitude due to linear RSD effects (which boosts the high $\mu$ signal).


3.1 Complete sets of estimators 

Suppose that we have measured $\alpha^2_{{\rm meas},\mu}$ in a series of (independent) bins in $\mu$ (which we can treat as infinite in number), then fitting these measurements with parameters $\alpha_\parallel^2$ and $\alpha_\perp^2$ would minimise

$$\chi^2 = \int_0^1 d\mu\,\sigma_0^{-2}\left[\mu^2\alpha_\parallel^2 + (1-\mu^2)\alpha_\perp^2 - \alpha^2_{{\rm meas},\mu}\right]^2, \tag{23}$$

where we have assumed that the scatter in the value of $\alpha^2_{{\rm meas},\mu}$ at a particular $\mu$ can be represented by a Gaussian random variable with expectation 0 and total variance $\sigma_0^2$ across all $\mu$. Furthermore, we have assumed that the noise is evenly distributed in $\mu$.

The maximum likelihood estimator for $(\alpha_\parallel^2, \alpha_\perp^2)$ can be calculated by finding the minimum, solving the equations $\nabla\chi^2 = 0$, where

$$\nabla\chi^2 = \frac{2}{\sigma_0^2}\left[\frac{1}{15}\begin{pmatrix} 3 & 2 \\ 2 & 8 \end{pmatrix}\begin{pmatrix}\alpha_\parallel^2 \\ \alpha_\perp^2\end{pmatrix} - \begin{pmatrix}\int_0^1 d\mu\,\mu^2\,\alpha^2_{{\rm meas},\mu} \\ \int_0^1 d\mu\,(1-\mu^2)\,\alpha^2_{{\rm meas},\mu}\end{pmatrix}\right]. \tag{24}$$

Following Eq. 7, the measured value $\int_0^1 d\mu\,\mu^2\alpha^2_{{\rm meas},\mu}$ is a linear transform of that recovered from a moment of the 2-point function with $F(\mu) = 3\mu^2$, and similarly for the $F(\mu) \propto (1-\mu^2)$ moment. Looking at both "measurements", we see that the maximum likelihood points are fully determined by the $p = 0$ and $p = 2$ power law moments, or equivalently by the monopole and quadrupole. Note that Eq. 24 relies on the fact that the model linearly depends on the parameters $(\alpha_\parallel^2, \alpha_\perp^2)$, and does not hold, for example, for fits to $(\alpha_\parallel, \alpha_\perp)$.

Eq. 23 can also be written in terms of $(\alpha_\parallel^2, \alpha_\perp^2)$, with an inverse covariance matrix given by

$$C^{-1} = \frac{1}{15\sigma_0^2}\begin{pmatrix} 3 & 2 \\ 2 & 8 \end{pmatrix}. \tag{25}$$

This can be calculated from the second derivatives of Eq. 23.


3.2 Predicted errors


In Section 3.1 we saw how the likelihood can be manipulated to understand constraints on $\alpha_\parallel^2$ and $\alpha_\perp^2$ from complete information (so the likelihood can be rewritten in terms of the new statistics), or the two $p = 0$ and $p = 2$ moments of those measurements. For fits to $\alpha_\parallel$ and $\alpha_\perp$ or for different moments, the likelihood derived is no longer complete. Instead, we recognize that for more general moments of positive-definite functions $F_1(\mu)$, $F_2(\mu)$ the covariance matrix is given by$^1$

$$\sigma^2_{1,2} = \sigma_0^2\int_0^1 d\mu\,F_1(\mu)F_2(\mu), \tag{26}$$

and we use this formula throughout this section to determine the expected uncertainty on and covariance between $\alpha_\parallel$ and $\alpha_\perp$ when using clustering measurements for pairs of $F(\mu)$ windows.

For a general power law moment, $F(\mu) = (1+p)\mu^p$, Eq. 26 yields $\sigma_p^2 = \sigma_0^2(1+p)^2/(1+2p)$. The covariance between an isotropic weighting and an arbitrary one is $\sigma^2_{0,F} = \sigma_0^2$. This implies that, in our formulation, introducing a measurement over a second window in $\mu$ as well as the monopole does not provide extra information on the total stretch; it only provides a way to determine the radial and transverse components of the stretch.

Assuming a combination of measurements for $p_1 = 0$, $p_2 = p$, the radial and transverse stretch are given by (see Eq. 8 with the real-space exponents of Section 2.2)

$$\alpha_\perp = \left(\alpha_0^{3(p+1)}\,\alpha_p^{-(p+3)}\right)^{\frac{1}{2p}}, \qquad \alpha_\parallel = \left(\alpha_0^{-3}\,\alpha_p^{\,p+3}\right)^{\frac{1}{p}}, \tag{27}$$

and we obtain the expected uncertainty on $\alpha_\perp$, $\alpha_\parallel$

$$\sigma_\parallel^2 = \sigma_0^2\,\frac{p^2 + 8p + 10}{1 + 2p}, \tag{28}$$

$$\sigma_\perp^2 = \sigma_0^2\,\frac{(p+13)(p+1)}{8p+4}, \tag{29}$$

and covariance

$$\sigma_{\parallel,\perp} = -\sigma_0^2\,\frac{p^2 + 2p + 7}{4p+2}. \tag{30}$$

For $p = 2$, these equations reduce to the inverse of the matrix in Eq. 25. The variance and the correlation, $C_{\parallel,\perp} = \sigma_{\parallel,\perp}/(\sigma_\parallel\sigma_\perp)$, are minimised for $p = 2$. Inspection of these results further reveals that they match those recovered in Section 3.1 for the optimal solution. Thus, we recover the same results using these approximate formulae as recovered for the (not approximate) ML solutions to measurements of $\alpha^2_\mu$, in the case where the ML solution is tested. We illustrate these results by plotting the 1 and 2$\sigma$ contours predicted by these sets of covariances for $p = 2$ (black, solid), 4 (red, dashed), and 6 (blue, dotted) in Fig. 3. The length of the minor axis, which is in a similar direction to the measurement from the monopole (see Fig. 1), stays nearly constant.

$^1$ This is the general formula for covariance between the means of two Gaussian random variables with arbitrary $F(\mu)$ weighting and variance $\sigma_0^2$ for $F(\mu) = 1$. It does not depend on the definition of $\alpha$ or $\mu$.

Figure 3. Ellipses showing the 1 and 2$\sigma$ contours for $\alpha_\parallel$ and $\alpha_\perp$, expected when combining the monopole with a power-law weighted moment $\int_0^1 d\mu\,\mu^p\xi(\mu)/\int_0^1 d\mu\,\mu^p$, which we denote $\xi_p$, for $p = 2, 4, 6$ (black, red, blue).
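The statement that the $p = 2$ case reproduces the maximum likelihood covariance can be verified directly. The sketch below inverts the matrix of Eq. 25 (with $\sigma_0^2 = 1$) and compares it with the closed forms of Eqs. 28-30 as reconstructed from the first-order relations; the function names are hypothetical, and the closed forms should be read as an assumption of this example rather than a quotation.

```python
def power_law_errors(p, sigma0_sq=1.0):
    """Variances and covariance of (alpha_par, alpha_perp) for moments p=0 and p."""
    var_par = sigma0_sq * (p**2 + 8 * p + 10) / (2 * p + 1)
    var_perp = sigma0_sq * (p + 13) * (p + 1) / (8 * p + 4)
    cov = -sigma0_sq * (p**2 + 2 * p + 7) / (4 * p + 2)
    return var_par, var_perp, cov

def invert_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] returned as the tuple (a', b', c', d')."""
    det = a * d - b * c
    return d / det, -b / det, -c / det, a / det

# Inverse of Eq. 25 with sigma_0^2 = 1: C^{-1} = [[3, 2], [2, 8]] / 15
cov_ml = invert_2x2(3 / 15, 2 / 15, 2 / 15, 8 / 15)
var_par, var_perp, cov = power_law_errors(2)
corr = cov / (var_par * var_perp) ** 0.5
print(cov_ml)
print(var_par, var_perp, cov, corr)
```

Both routes give variances of $6\sigma_0^2$ and $2.25\sigma_0^2$ with covariance $-1.5\sigma_0^2$, corresponding to uncertainties of about $2.45\sigma_0$ and $1.50\sigma_0$ and a correlation of about $-0.41$.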

For Wedges, Eq. 26 yields $\sigma_{1,2} = 0$ and $\sigma_1^2 = \sigma_0^2/\mu_d$, $\sigma_2^2 = \sigma_0^2/(1-\mu_d)$. Given zero correlation between non-overlapping Wedges, in principle one may gain information by using an arbitrarily large number of (non-overlapping) Wedges. However, we have shown that just two moments, equivalent to the monopole and quadrupole, form a complete set of estimators. Thus, we investigate only the case where two, non-overlapping, Wedges are used$^2$. We predict the uncertainties and covariance as a function of the Wedge split, $\mu_d$, to be

$$\sigma_\parallel^2 = \sigma_0^2\left[\frac{1}{\mu_d}\left(\frac{\mu_d^2 + \mu_d - 2}{\mu_d + 1}\right)^2 + \frac{1}{1-\mu_d}\left(\frac{3 - \mu_d^2}{\mu_d + 1}\right)^2\right], \tag{31}$$

$$\sigma_\perp^2 = \sigma_0^2\left[\frac{1}{\mu_d}\left(\frac{\mu_d^2 + \mu_d + 1}{\mu_d + 1}\right)^2 + \frac{1}{1-\mu_d}\left(\frac{\mu_d^2}{\mu_d + 1}\right)^2\right], \tag{32}$$

and

$$\sigma_{\parallel,\perp} = \sigma_0^2\left[\frac{(\mu_d^2 + \mu_d - 2)(\mu_d^2 + \mu_d + 1)}{\mu_d(\mu_d+1)^2} - \frac{\mu_d^2(3 - \mu_d^2)}{(1-\mu_d)(\mu_d+1)^2}\right]. \tag{33}$$

We evaluate Eqs. 31 and 32 for $0 < \mu_d < 1$ and compare the results to those recovered from the combination of $\alpha_0$, $\alpha_2$ (equivalent to the information in the monopole and quadrupole). The results are shown in Fig. 4. One can see that variance is minimised at $\mu_d = 0.64$, but that the $\alpha_0$, $\alpha_2$ combination always performs better. We display similar information in Fig. 5, except that we now plot the correlation $C_{\parallel,\perp}$. Its magnitude is also minimized at $\mu_d = 0.64$ and is always greater than that of the $\alpha_0$, $\alpha_2$ combination.

$^2$ In the limit of infinite Wedges, the predicted uncertainties will clearly converge to that of the monopole and quadrupole.

Figure 4. Red curves display the predicted uncertainty in $\alpha_\parallel$ (dashed) and $\alpha_\perp$ (solid) recovered using Wedges, as a function of the split in $\mu$. The black curves display the predicted uncertainty for the combination of either the monopole and quadrupole, or the monopole and a $\mu^2$ window.

Figure 5. The solid red curve displays the predicted correlation between $\alpha_\parallel$ and $\alpha_\perp$ recovered using Wedges, as a function of the split in $\mu$. The dashed black curve displays the predicted correlation for the combination of either the monopole and quadrupole, or the monopole and a $\mu^2$ window.

Table 2 summarises the predictions we make for the recovered uncertainty on $\alpha_\parallel$, $\alpha_\perp$ and their correlation. One can see that the predicted uncertainties on $\alpha_\parallel$ and $\alpha_\perp$ and their covariance are worse, by close to 10 per cent for each, for Wedges than for the combination of $\xi_0$ and $\xi_2$. We illustrate this same information in Fig. 6, where the expected 1 and 2$\sigma$ contours are displayed for Multipoles (black, solid) and Wedges split at $\mu_d = 0.64$ (red, dashed). The major-axes of the ellipses are nearly aligned and it is along this direction that Wedges provide less-optimal constraints.
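The Wedge predictions can be reproduced by propagating the two independent Wedge variances, $\sigma_0^2/\mu_d$ and $\sigma_0^2/(1-\mu_d)$, through the first-order relations between the Wedge stretches and $(\alpha_\parallel, \alpha_\perp)$. The sketch below is a hedged illustration of that propagation; the function name and the grid used to locate the minimum are assumptions of the example.

```python
def wedge_errors(mu_d, sigma0_sq=1.0):
    """Variances and covariance of (alpha_par, alpha_perp) from two Wedges."""
    var1 = sigma0_sq / mu_d            # transverse wedge, window 1/mu_d
    var2 = sigma0_sq / (1.0 - mu_d)    # radial wedge, window 1/(1 - mu_d)
    # First-order coefficients of alpha_par and alpha_perp on (alpha_1, alpha_2)
    par1 = (mu_d**2 + mu_d - 2.0) / (mu_d + 1.0)
    par2 = (3.0 - mu_d**2) / (mu_d + 1.0)
    perp1 = (mu_d**2 + mu_d + 1.0) / (mu_d + 1.0)
    perp2 = -mu_d**2 / (mu_d + 1.0)
    var_par = par1**2 * var1 + par2**2 * var2
    var_perp = perp1**2 * var1 + perp2**2 * var2
    cov = par1 * perp1 * var1 + par2 * perp2 * var2
    return var_par, var_perp, cov

for mu_d in (0.5, 0.64):
    vp, vt, c = wedge_errors(mu_d)
    print(mu_d, round(vp**0.5, 2), round(vt**0.5, 2), round(c / (vp * vt)**0.5, 2))

# Coarse scan for the split that minimises the radial variance
best = min((wedge_errors(i / 1000.0)[0], i / 1000.0) for i in range(1, 1000))[1]
print(best)
```

The values at $\mu_d = 0.5$ and $0.64$ can be compared with the predicted columns of Table 2, and the scan gives a rough location for the split that minimises the radial variance.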


Table 2. The predicted uncertainty on the radial and transverse stretch, $\sigma_\parallel$ and $\sigma_\perp$, relative to the uncertainty on the spherically averaged stretch, and their correlation, $C_{\parallel,\perp}$. $S$ denotes the standard deviation recovered from BAO fits to the mocks. $S_{\parallel,\perp}$ denotes the correlation as recovered from the scatter of the BAO fits to the mocks. 'W' represents Wedges, and 'M' denotes the usage of $\xi_0$, $\xi_2$. Compared to our predictions, the fits to the mocks are less precise but the overall trends agree. We discuss this further in subsequent sections.

Method              $\sigma_\parallel$   $\sigma_\perp$   $C_{\parallel,\perp}$   $S_\parallel$   $S_\perp$   $S_{\parallel,\perp}$
M                   $2.45\sigma_0$   $1.50\sigma_0$   -0.41    $2.79\sigma_0$   $1.58\sigma_0$   -0.49
W, $\mu_d = 0.5$    $2.85\sigma_0$   $1.67\sigma_0$   -0.54    $2.98\sigma_0$   $1.73\sigma_0$   -0.56
W, $\mu_d = 0.64$   $2.73\sigma_0$   $1.62\sigma_0$   -0.50    $3.00\sigma_0$   $1.66\sigma_0$   -0.54

Figure 6. Ellipses showing the 1 and 2$\sigma$ contours for $\alpha_\parallel$ and $\alpha_\perp$, expected when using multipoles (or $\mu^0$ and $\mu^2$ power-law moments; black) and when using Wedges split at $\mu_d = 0.64$.


4 BAO FITTING

We use the same model to fit anisotropic BAO scale information as applied in Anderson et al. (2014). We use only post-reconstruction data and match all fiducial parameter choices to those used in Anderson et al. (2014). We generate template $\xi(s)$ using the linear $P_{\rm lin}(k)$ obtained from CAMB using the same cosmology as Anderson et al. (2014) (flat, $\Omega_m = 0.274$, $\Omega_b h^2 = 0.0224$, $h = 0.7$). We account for redshift-space distortion (RSD) and non-linear effects via

$$P(k,\mu) = F(k,\mu,\Sigma_s)\left(P_{\rm lin}(k)\,e^{-k^2\sigma_v^2} + A_{\rm MC}P_{\rm MC}(k)\right), \tag{34}$$

where

$$F(k,\mu,\Sigma_s) = \frac{1}{(1 + k^2\mu^2\Sigma_s^2/2)^2} \tag{35}$$

and

$$P_{\rm MC}(k) = 2\int\frac{d^3q}{(2\pi)^3}\,|F_2(k-q,q)|^2\,P_{\rm lin}(|k-q|)\,P_{\rm lin}(q), \tag{36}$$

and we fix $\Sigma_s = 3\,h^{-1}$Mpc, $\sigma_v = 1.9\,h^{-1}$Mpc and $A_{\rm MC} = 0.05$ in all of our fits, as in Anderson et al. (2014). The motivation for these choices is discussed in Vargas Magaña et al. (2013).

Given P(fc, p), we determine the multipole moments 


9/ -I- 1 

Pdk)^^^ J ^P{k,^^)Ld^^), 


(37) 


where Lfd) are Legendre polynomials. These are transformed to 
6 via 

7^ f 

C{s) = ^ / dkCPdk)jdks) (38) 

We then use 

= (39) 

e 

and take averages over any given jj, window to create any particular 
template: 

5(s, a±,a^i)F ,mod(5) / (40) 

^0 

where f and s' = 

In practice, we fit for aF,oi\\ using ^o ,^2 and ^wiAw 2 , 
where W1 and W2 represent transverse and radial wedges split 
at either fid = 0.5 or fid = 0.64. When fitting to Wedges, we fit to 
the data using the model 

^wi ,mod ) = Biifvvi(s, Oil) + ^wi(s) (41) 

^W2 ,mod (s) — B 2 ^W 2 {s,a±,a^^) + Aw 2 {s), (42) 

where Aj:(s) = + ax, 2 /s + 

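To fit $\xi_0, \xi_2$, we recognize
$\xi_2 = 5\int_0^1 d\mu\,\left[1.5\mu^2\xi(\mu) - 0.5\xi(\mu)\right]$ and, denoting
$\int_0^1 d\mu\,\mu^2\xi(\mu)$ as $\xi_{\mu 2}$, we fit to the data using the model

$$\xi_{0,{\rm mod}}(s) = B_0\,\xi_0(s,\alpha_{\perp},\alpha_{\parallel}) + A_0(s) \tag{43}$$

$$\xi_{2,{\rm mod}}(s) = 5\left(1.5B_\mu\,\xi_{\mu 2}(s,\alpha_{\perp},\alpha_{\parallel}) - 0.5B_0\,\xi_0(s,\alpha_{\perp},\alpha_{\parallel})\right) + A_2(s) \tag{44}$$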
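The decomposition of the quadrupole into a $\mu^2$ moment and the monopole is a linear identity, which can be checked numerically. The sketch below uses a toy $\xi(\mu)$ at fixed $s$ (the coefficients are illustrative only, not taken from any fit), takes $\xi_{\mu 2}$ to be $\int_0^1 d\mu\,\mu^2\xi(\mu)$, and sets all amplitudes $B_x = 1$ and $A_x = 0$.

```python
import numpy as np

# Midpoints of 100 mu bins on [0, 1].
mu = (np.arange(100) + 0.5) / 100


def moment(xi_mu, F):
    """Midpoint-rule estimate of int_0^1 dmu F(mu) xi(mu)."""
    return np.mean(F * xi_mu)


# Toy anisotropic correlation function at one separation (illustrative numbers).
xi_mu = 1.0 + 0.3 * mu**2 - 0.1 * mu**4

xi_0 = moment(xi_mu, np.ones_like(mu))      # monopole
xi_mu2 = moment(xi_mu, mu**2)               # mu^2 moment
L2 = 1.5 * mu**2 - 0.5                      # second-order Legendre polynomial

xi_2_direct = 5.0 * moment(xi_mu, L2)
xi_2_combined = 5.0 * (1.5 * xi_mu2 - 0.5 * xi_0)
```

Because the two expressions agree for any $\xi(\mu)$, fitting separate amplitudes to the $\mu^2$ moment and the monopole changes only how the broadband shape is absorbed, not the BAO information carried by the quadrupole.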

For all $B_x$, the parameter essentially sets the size of the BAO feature
in the template. We apply a Gaussian prior of width $\log(B_x) = 0.4$
around the best-fit $B_0$ in the range $45 < s < 80\,h^{-1}$Mpc with
$A_x = 0$; this treatment assumes the amplitude of the BAO feature
is isotropic.


5 EMPIRICAL RESULTS 


We use PTHalo (Manera et al. 2013) mock galaxy catalogs (mocks)
to empirically test our analytical derivations. The mocks we use
were created to match the SDSS-III (Eisenstein et al. 2011) data
release 11 (DR11) BOSS (Dawson et al. 2013) CMASS sample. The
imaging (Fukugita et al. 1996; Gunn et al. 1998) and spectroscopic
data (Smee et al. 2013) were obtained using the SDSS telescope
(Gunn et al. 2006) and reduced as described in Bolton et al. (2012).
The DR11 CMASS sample contains galaxies with $b \sim 2$
(White et al. 2011) distributed over 8500 deg$^2$ with $0.43 < z < 0.7$.
The 600 PTHalo mocks created to match this sample are described
in Manera et al. (2013) and Anderson et al. (2014). Results
for Wedges and Multipoles fitting to these mocks have previously
been published in Anderson et al. (2014), and we use the same
post-reconstruction pair-counts as in Anderson et al. (2014). We
bin $\xi(s)$ in $s$ bins of width $8\,h^{-1}$Mpc, matching the fiducial choice
of Anderson et al. (2014) that was determined optimal in Percival
et al. (2014). We calculate $\xi(s,\mu)$ in $\mu$ bins of width 0.01 using the
Table 3. The statistics of BAO scale measurements recovered from the DR11 mock samples. ‘A14’ results are taken from Anderson et al. (2014). All values
are recovered from the distribution of the fits to the 600 mocks; $\langle\rangle$ denote the mean values, $S$ denotes standard deviation, and $C_{\parallel,\perp}$ denotes the correlation
between the maximum likelihood values of $\alpha_{\parallel}$, $\alpha_{\perp}$.

Publication              Method               $\langle\alpha_{\perp}\rangle$   $\langle\sigma_{\perp}\rangle$   $S_{\perp}$   $\langle\alpha_{\parallel}\rangle$   $\langle\sigma_{\parallel}\rangle$   $S_{\parallel}$   $C_{\parallel,\perp}$
Anderson et al. (2014)   M                    0.9999   0.0137   0.0149   1.0032   0.0248   0.0266   -
                         W, $\mu_d = 0.5$     0.9993   0.0161   0.0153   1.0006   0.0296   0.0264   -
This work                M                    0.9987   0.0150   0.0145   1.0017   0.0232   0.0257   -0.49
                         W, $\mu_d = 0.5$     0.9992   0.0159   0.0157   1.0010   0.0274   0.0274   -0.56
                         W, $\mu_d = 0.64$    0.9980   0.0153   0.0152   1.0032   0.0274   0.0276   -0.54



Figure 7. The mean $3\int_0^1 d\mu\,\mu^2\xi(\mu)$, denoted $3\xi_{\mu 2}$, recovered from
post-reconstruction DR11 CMASS mocks.

Landy & Szalay (1993) method, modified for reconstruction (Padmanabhan
et al. 2012),

$$\xi(s,\mu) = \frac{DD(s,\mu) - 2DS(s,\mu) + SS(s,\mu)}{RR(s,\mu)}, \tag{45}$$

where $D$ represents the reconstructed data points, $R$ is a set of points
randomly sampling the angular and radial selection functions, and $S$
is a separate set of these random points whose positions have been
shifted by the reconstruction according to the reconstructed density
field (Padmanabhan et al. 2012). We then determine the correlation
function for any particular window over $\mu$ via

$$\xi_F(s) = \sum_{i=1}^{100} 0.01\,\xi(s,\mu_i)\,F(\mu_i), \tag{46}$$

where $\mu_i = 0.01i - 0.005$.
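The estimator and the window sum can be written compactly. The sketch below is illustrative only: the function names are hypothetical, and the pair-count arrays are assumed to be already normalised by the respective numbers of pairs and binned on a common grid of shape (number of $s$ bins, 100 $\mu$ bins).

```python
import numpy as np


def landy_szalay_recon(DD, DS, SS, RR):
    """Eq. (45): post-reconstruction estimator from normalised pair counts."""
    return (DD - 2.0 * DS + SS) / RR


def window_moment(xi_s_mu, F):
    """Eq. (46): xi_F(s) = sum_i (1/n_mu) xi(s, mu_i) F(mu_i).

    xi_s_mu has shape (n_s, n_mu); F is a callable window function of mu.
    For n_mu = 100 the bin centres are mu_i = 0.01 i - 0.005, i = 1..100.
    """
    n_mu = xi_s_mu.shape[-1]
    mu_i = (np.arange(1, n_mu + 1) - 0.5) / n_mu
    return np.sum(xi_s_mu * F(mu_i), axis=-1) / n_mu


# Usage with toy, constant pair counts (illustrative numbers only).
shape = (5, 100)
xi = landy_szalay_recon(np.full(shape, 1.2), np.ones(shape),
                        np.ones(shape), np.ones(shape))
xi_0 = window_moment(xi, lambda mu: np.ones_like(mu))   # F = 1 (monopole)
xi_3mu2 = window_moment(xi, lambda mu: 3.0 * mu**2)     # F = 3 mu^2
```

For an isotropic $\xi$, the $F = 1$ and $F = 3\mu^2$ windows return the same value up to the small discretisation error of the midpoint sum.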

Fig. 7 displays the mean $\xi_0$ recovered from these mocks
post-reconstruction (black curve) compared to the mean $3\xi_{\mu 2}$
moment (red curve). In principle, they should appear identical, as
RSD have been removed in the reconstruction. However, differences
are observed that are similar to the differences observed in
post-reconstruction Wedges (see, e.g., figure 19 of Anderson et al.
2014).

We measure the line-of-sight, $\alpha_{\parallel}$, and transverse, $\alpha_{\perp}$, BAO
scale information for each of the 600 mocks using three different
pairs of observables:

(i) The combination of $\xi_0$ and $\xi_2$, as described by Eqs. 43 and
44; we denote these results as ‘M’ (for Multipoles).

(ii) The combination of $\xi_{W1}$ and $\xi_{W2}$ Wedges split at $\mu_d = 0.5$;
we denote these results as ‘W, $\mu_d = 0.5$’.

(iii) The combination of $\xi_{W1}$ and $\xi_{W2}$ Wedges split at $\mu_d = 0.64$;
we denote these results as ‘W, $\mu_d = 0.64$’.

For both Wedges, we use the model described by Eqs. 41 and 42.

Our results are shown in Table 3, where we also display the
results from Anderson et al. (2014), denoted with ‘A14’. Our
implementations of Multipoles and of Wedges split at $\mu_d = 0.5$
generally match those of Anderson et al. (2014) closely, though
variations of up to 10 per cent are found for some standard
deviations and mean uncertainties.

The uncertainties and standard deviations are slightly worse
than our analytic predictions, as can be seen by comparing the three
left-hand columns to the three right-hand columns in Table 2. The
discrepancies are greatest for $\alpha_{\parallel}$ and for Multipoles; the recovered
standard deviation on $\alpha_{\parallel}$ is 14 per cent larger than expected for
Multipoles, which is likely related to the fact that the correlation
between $\alpha_{\perp}$ and $\alpha_{\parallel}$ is 20 per cent larger than expected. Despite
not matching our quantitative predictions, the Multipoles fits still
match our qualitative predictions: they recover the smallest
standard deviations, mean uncertainties, and correlation between
$\alpha_{\parallel}$ and $\alpha_{\perp}$.

The Wedges split at $\mu_d = 0.5$ produce the results closest to
our analytic predictions; the recovered standard deviations on
$\alpha_{\parallel}$ and $\alpha_{\perp}$ and their correlation are all between 3 and 5 per cent
greater than predicted. We find that Wedges split at $\mu_d = 0.64$
result in only a small improvement in the variance of $\alpha_{\perp}$ and the
correlation between $\alpha_{\perp}$ and $\alpha_{\parallel}$, while producing a slight increase
in the variance of $\alpha_{\parallel}$. The $\mu_d = 0.5$ Wedges recover the least biased
mean $\alpha_{\perp}$ and $\alpha_{\parallel}$ of the three methods we apply, though the
difference in the bias compared to the Multipoles results is negligibly
small (at most 0.034$\sigma$).
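The percentages quoted above follow directly from the tabulated predictions and mock results. As a short check, with the values copied from the table of predicted and recovered uncertainties (in units of $\sigma_0$) and hypothetical variable names:

```python
# (sigma_par, sigma_perp, C) predicted analytically, in units of sigma_0.
predicted = {
    "M": (2.45, 1.50, -0.41),
    "W, mu_d=0.5": (2.85, 1.67, -0.54),
    "W, mu_d=0.64": (2.73, 1.62, -0.50),
}
# (S_par, S_perp, C) recovered from the 600 mocks, in the same units.
mocks = {
    "M": (2.79, 1.58, -0.49),
    "W, mu_d=0.5": (2.98, 1.73, -0.56),
    "W, mu_d=0.64": (3.00, 1.66, -0.54),
}


def excess_percent(method):
    """Percentage by which each mock statistic exceeds its prediction."""
    return tuple(100.0 * (m / p - 1.0)
                 for p, m in zip(predicted[method], mocks[method]))


for method in predicted:
    par, perp, corr = excess_percent(method)
    print(f"{method:14s} S_par {par:+5.1f}%  S_perp {perp:+5.1f}%  C {corr:+5.1f}%")
```

The Multipoles row gives roughly 14 per cent for $\alpha_{\parallel}$ and 20 per cent for the correlation, and all three entries for the $\mu_d = 0.5$ Wedges fall between 3 and 5 per cent.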

The results of our fits to the mocks are illustrated in Fig. 8,
where we take the standard deviations and correlations of $\alpha_{\parallel}$ and
$\alpha_{\perp}$ for the different fitting techniques we apply and assume
Gaussian statistics. Compared to Fig. 6, one can see that the ellipses are
all more elongated (reflecting the increased uncertainty on $\alpha_{\parallel}$ over
that predicted). Similar to our predictions, the Multipoles ellipse
is significantly smaller than the Wedges ellipses.
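Ellipses of this kind can be built from a standard deviation pair and a correlation coefficient. The sketch below assumes a bivariate Gaussian and uses $\Delta\chi^2 = 2.30$ for the contour enclosing roughly 68 per cent of the probability for two parameters; the function names are hypothetical, and the inputs are the mock standard deviations and correlations from Table 3.

```python
import numpy as np


def covariance(S_perp, S_par, C):
    """2x2 covariance matrix of (alpha_perp, alpha_par)."""
    off = C * S_perp * S_par
    return np.array([[S_perp**2, off], [off, S_par**2]])


def ellipse(cov, delta_chi2=2.30):
    """Semi-axes, orientation (degrees from the alpha_perp axis) and area."""
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    semi_minor, semi_major = np.sqrt(delta_chi2 * evals)
    major = evecs[:, 1]                         # direction of the major axis
    angle = np.degrees(np.arctan2(major[1], major[0]))
    area = np.pi * semi_major * semi_minor
    return semi_major, semi_minor, angle, area


results = {
    "M": covariance(0.0145, 0.0257, -0.49),
    "W, mu_d=0.5": covariance(0.0157, 0.0274, -0.56),
    "W, mu_d=0.64": covariance(0.0152, 0.0276, -0.54),
}
areas = {name: ellipse(cov)[3] for name, cov in results.items()}
```

The ellipse area scales as $S_{\perp}S_{\parallel}\sqrt{1 - C^2}$, so a stronger correlation partly offsets larger individual standard deviations; with these inputs the Multipoles ellipse still has the smallest area.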

Table 4 lists the correlations we find between the three different
treatments we consider for $\alpha_{\parallel}$ and $\alpha_{\perp}$, and the standard
deviation obtained when averaging the measurements accounting for the
correlation. The correlations are all at least 0.85, and differences
from 1 are caused mainly by the relative precision achieved
in each method. Thus, there is negligible gain achieved by averaging
the measurements: there is at best a one per cent gain in the
precision over that achieved by using only the Multipoles results.
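One standard way to average two correlated measurements of the same quantity is the minimum-variance linear combination, whose variance is $1/(\mathbf{1}^T \mathsf{C}^{-1}\mathbf{1})$ for the $2\times 2$ covariance matrix $\mathsf{C}$. The sketch below assumes this estimator (the text does not spell out which combination was used) and applies it to the standard deviations and correlations quoted in Tables 3 and 4; the function name is hypothetical.

```python
import numpy as np


def combine(S1, S2, C):
    """Minimum-variance average of two measurements with correlation C.

    Returns the standard deviation of the combination and the weights.
    """
    cov = np.array([[S1**2, C * S1 * S2], [C * S1 * S2, S2**2]])
    w = np.linalg.solve(cov, np.ones(2))   # proportional to C^{-1} 1
    w /= w.sum()                           # weights sum to one
    return np.sqrt(w @ cov @ w), w


# Multipoles combined with Wedges split at mu_d = 0.5.
S_par, w_par = combine(0.0257, 0.0274, 0.85)     # alpha_par
S_perp, w_perp = combine(0.0145, 0.0157, 0.86)   # alpha_perp
```

Under these assumptions the combined values agree with the tabulated 0.0254 and 0.0144 to within rounding, a gain of about one per cent over Multipoles alone.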
[Figure 8: plot with horizontal axis $\alpha_{\perp}$.]

Figure 8. Ellipses showing the recovered standard deviations and
correlations between $\alpha_{\parallel}$ and $\alpha_{\perp}$ for the different fitting techniques we apply,
produced assuming these statistics describe a multivariate-Gaussian
likelihood distribution.


Table 4. Correlations between the recovered $\alpha_{\perp}$ or $\alpha_{\parallel}$ for different methods
and the expected uncertainty when averages are taken incorporating the
correlation. $C$ denotes correlation and $S_c$ denotes the standard deviation
after combining the two measurements, accounting for the correlation.

Methods                                $C_{\parallel}$   $C_{\perp}$   $S_{c,\parallel}$   $S_{c,\perp}$
Wedges $\mu_d = 0.5$; Multipoles       0.85   0.86   0.0254   0.0144
Wedges $\mu_d = 0.64$; Multipoles      0.89   0.90   0.0256   0.0144


6 CONCLUSIONS 

We have derived analytic formulae that describe the relative
importance of angular and radial dilation measured from moments of
2-point clustering statistics, with respect to the cosine of the angle
to the line of sight, $\mu$. We have derived formulae for an arbitrary
window, $F$, that weights information with respect to $\mu$, and have
provided solutions for the cases where $F(\mu)$ is a power law
(Section 2.2), a “Wedge” where $F(\mu)$ is 1 for a given range of $\mu$ and
zero otherwise (Section 2.3), and where the window is the 2nd-order
Legendre polynomial (i.e., the clustering observable is the
quadrupole moment; Section 2.4). We have presented results in
real space, valid both for the correlation function and, in Fourier
space, for moments of the power spectrum. These formulae extend
the commonly used assumption that isotropically averaged BAO
provide a measurement of $D_V(z)$ to other moments and allow for
RSD when using power spectrum moments.

In Section 3, we derive the expected uncertainty of, and covariance
between, $\alpha_{\perp}$ and $\alpha_{\parallel}$ obtained from a pair of clustering
measurements calculated for two different $F(\mu)$, assuming that
information is evenly distributed in $\mu$ (as is approximately the case
for the BOSS CMASS galaxy sample). We show that the optimal,
maximum-likelihood solution is the combination of the monopole
and quadrupole, or equivalently the monopole and $F(\mu) = \mu^2$. We
show that a third power-law window only adds degenerate information
and should not increase the statistical precision on $D_A(z)$
and $H(z)$. We then find the optimal combination of Wedges, which
are those split at $\mu_d = 0.64$. For this optimal Wedge, we
predict that the uncertainties on and correlations between $D_A(z)$ and
$H(z)$ are between 8 and 11 per cent larger than for the combination
of the monopole and quadrupole.

Our results differ from those of Taruya et al. (2011) and Kazin et
al. (2012), as both studies found that including the hexadecapole
significantly decreased the recovered uncertainty on $D_A(z)$ and
$H(z)$. The key difference in our study is that we derive our results
for post-reconstruction galaxy clustering measurements, where the
Legendre polynomial moments are expected to be zero, except for
the monopole. Thus, in our analytic formulation (supported by our
empirical results), the inclusion of the quadrupole does not increase
the total amount of BAO scale information (the covariance between
the BAO information in the $p = 2$ moment and in the monopole is
the same as the variance expected for the $p = 2$ moment). It simply
allows for the information to be optimally projected into the
$D_A(z), H(z)$ basis (and therefore does increase the total amount of
cosmological information), and thus there is no additional information
in the hexadecapole. In redshift-space, as studied by Taruya et
al. (2011) and Kazin et al. (2012), the quadrupole and hexadecapole are
expected to be non-zero and thus do contribute to the total amount
of BAO information.

In our derivations, we consider only the $\mu$-dependent dilation
at a particular scale and assume the information at each $\mu$ is
independent. Such an assumption may be more appropriate in
$k$-space, where $P(k,\mu_1)$ and $P(k,\mu_2)$ are expected to be independent
(not accounting for any survey window function), but we test
our derivations using the redshift-space correlation function, where
$\xi(s,\mu_1)$ and $\xi(s,\mu_2)$ are not independent. Despite these assumptions,
the results we recover from tests on mock samples closely
match our predictions, especially for $\alpha_{\perp}$, as presented in Section 5.

Using the set of mock catalogues produced for the BOSS
DR11 analysis, we find that, as predicted, in terms of the recovered
uncertainty of, standard deviation of, and covariance between
$\alpha_{\parallel}$ and $\alpha_{\perp}$, fitting to Multipoles produces the optimal results of
the three cases we test, matching our analytic predictions. We also
find, as predicted, that Wedges split at $\mu_d = 0.64$ are optimal compared
to Wedges split at $\mu_d = 0.5$, although the decrease in uncertainty
is small (< 5 per cent). We find that the correlation between
Multipoles and Wedges is large enough that there is a negligible gain in
information (1 per cent reduction in the standard deviation) when
the results are combined.

We find a slight trend where the methods that depend most
strongly on clustering measurements at high $\mu$ are the most biased.
The bias is small, as the largest bias, found for the $\mu_d = 0.64$
Wedge, is only 0.13$\sigma$. This trend is thus likely due to inaccuracies
in our modelling of the BAO feature at high $\mu$, where the non-linear
RSD signal is strongest. If the modelling as a function of $\mu$ can be
improved in future analyses, we expect the trend in bias will
decrease and that the recovered uncertainties and correlations will be
a closer match to our predictions for Multipoles. We therefore
believe that improving the $\mu$ dependence of the post-reconstruction
BAO template should be a priority for future BAO studies, and that
by doing so, the precision of the measurements made using
Multipoles will increase.
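The size of the bias in units of the standard deviation can be read off from the mock statistics. The following check uses the ‘This work’ rows of Table 3, taking the expected value of both dilation parameters to be 1 and measuring each bias against the standard deviation of the same parameter; the dictionary and function names are hypothetical.

```python
# (<alpha_perp>, S_perp, <alpha_par>, S_par) for each method, from Table 3.
table3 = {
    "M": (0.9987, 0.0145, 1.0017, 0.0257),
    "W, mu_d=0.5": (0.9992, 0.0157, 1.0010, 0.0274),
    "W, mu_d=0.64": (0.9980, 0.0152, 1.0032, 0.0276),
}


def bias_in_sigma(method):
    """Bias of the mean alpha_perp and alpha_par, in units of the standard deviation."""
    a_perp, S_perp, a_par, S_par = table3[method]
    return (a_perp - 1.0) / S_perp, (a_par - 1.0) / S_par


# Largest absolute bias (over both parameters) for each method.
largest = {m: max(abs(b) for b in bias_in_sigma(m)) for m in table3}
most_biased = max(largest, key=largest.get)
```

With these inputs the largest bias belongs to the $\mu_d = 0.64$ Wedges, at about 0.13 standard deviations, and arises in $\alpha_{\perp}$.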

Our analysis provides further support for the future use of
BAO to make robust cosmological measurements. We have carefully
considered the meaning of BAO measurements made from
moments of 2-point functions, providing an optimal approach. Both
this work and the recent work of Zhu et al. (2014), who considered
radial weighting of BAO measurements, are testing and optimising
the BAO measurement methodology, increasing our understanding
in line with the increasing statistical precision afforded by future
surveys. Our results, and the conclusions we draw, are specific to
the case where information is evenly distributed in $\mu$. Thus,
interesting possible extensions include extending the methodology to
more general cases with different distributions of information with
$\mu$ (e.g., Ly$\alpha$ or redshift-space measurements determined without
using reconstruction), and allowing for correlations in $\mu$ in the
covariance matrix of $\xi(r)$ required for small surveys. Such studies are
likely to find that more than two moments are required to capture
the full information content of the BAO signal.